ABSTRACT

This  paper  presents  the  integration  and  evaluation  of  two 
popular  camera  calibration  techniques  for  stereo  vision  system 
development  for  motion  capture.  An  integrated  calibration 
technique  for  stereo  vision  systems  has  been  developed.  To 
demonstrate  and  evaluate  this  calibration  technique,  multiple 
Wii  Remotes  (Wiimotes)  from  Nintendo  were  used  to  form 
stereo  vision  systems  to  perform  3D  motion  capture  in  real 
time.  This  integrated  technique  is  a  two-step  process:  it  first 
calibrates  the  intrinsic  parameters  of  each  camera  using 
Zhang’s  algorithm  [5]  and  then  calibrates  the  extrinsic 
parameters  of  the  cameras  together  as  a  stereo  vision  system 
using  Svoboda’s  algorithm  [9].  Computer  software  has  been 
developed  for  implementation  of  the  integrated  technique,  and 
experiments  carried  out  using  this  technique  to  perform  motion 
capture  with  Wiimotes  show  a  significant  improvement  in  the 
measurement  accuracy  over  the  existing  calibration  techniques. 

KEYWORDS:  Camera  calibration,  Stereo  vision,  Wiimote, 
Motion  capture. 

1.  INTRODUCTION 

Systems  based  on  mechanical,  magnetic,  acoustic,  inertial 
and  optical  technologies  for  motion  capture  for  interactive 
computer  graphic  simulation  and  other  applications  have  been 
explored  by  researchers  for  many  years  and  commercial 
products  are  continuously  evolving.  In  particular,  multi-camera 
systems have continued to evolve because of continuously
decreasing prices of powerful computers and cameras. Among
them  infrared  based  optical  motion  capture  systems  (e.g. 
iotracker,  PhaseSpace,  Vicon,  ART  Gmbh,  NaturalPoint)  are 
becoming  increasingly  popular  because  of  their  higher 
precision  and  better  flexibility.  These  systems  are  less 
susceptible  to  adverse  shop  floor  conditions. 

Good calibration is the key to the efficient use of a multi-camera
stereo vision system. The calibration method largely
depends  on  available  resources.  Kitahara  et  al.  [1]  calibrated 
their  large-scale  multi-camera  environment  by  using  a  classic 
method  [2].  The  3D  points  were  collected  by  a  combined  use  of 
a  calibration  board  and  a  3D  laser  surveying  instrument.  The 
cost  of  the  calibration  hardware  required  for  this  method  is 
much  higher  than  the  cost  of  the  cameras.  In  comparison,  the 
calibration  technique  discussed  in  the  present  paper  requires 
minimum  investment  in  the  calibration  hardware  and  thus  is 
ideal  for  cheap  IR  cameras  like  Wii  Remotes  (Wiimotes)  used 
in  Nintendo  Wii  games. 

Many  researchers  have  successfully  dealt  with  the  problem 
of  camera  calibration  by  taking  images  from  a  2D  object 
consisting  of  a  planar  pattern  [3,  4,  5].  Most  of  the  other 
calibration  techniques  using  planar  objects  are  directly  or 
indirectly  derived  from  these  techniques.  Tsai's  method  of 
camera  calibration  is  a  classic  one  and  is  still  widely  used  in 
computer  vision,  and  there  have  been  implementations  of  this 
method in C/C++ and other programming languages. The DLT-based
calibration model presented by Heikkilä and Silvén [3]
uses the concepts and techniques of Melen [6] in
photogrammetry, and its implementation is available as a
MATLAB software package. Zhang’s method [5] is a newer
one  and  it  makes  use  of  advanced  concepts  in  projective 
geometry,  and  the  implementation  of  this  method  is  also 
available  in  a  MATLAB  toolbox. 

A  detailed  comparative  study  of  the  above  three  methods 
has  been  carried  out  by  Zollner  and  Sablatnig  [7],  whose 
experimental  evaluations  indicated  that  the  overall  error  of  the 
DLT-based  estimation  is  significantly  smaller  than  Tsai’s 
method  in  the  mono-view  case,  but  the  DLT-based  method 
generates  larger  errors  than  Zhang’s  method  in  the  multi-view 
case. 

We  recently  developed  a  technique  to  integrate  multiple 
stereo  vision  systems,  each  consisting  of  2  digital  cameras,  with 
the  individual  stereo  vision  systems  calibrated  with  Zhang’s 
algorithm  [5]  to  determine  the  camera’s  intrinsic  and  extrinsic 
parameters.  These  individually  calibrated  systems  are  then 
calibrated  together  using  Horn’s  algorithm  [8]  to  determine 
their  relative  positions  and  orientations.  The  implementation  of 
this  technique  showed  that  a  single  stereo  system  with  two 
Wiimotes  offered  an  accuracy  of  2.7  mm,  while  an  integrated, 
two-stereo  system  with  two  Wiimotes  each  for  a  total  of  four 
Wiimotes  offered  an  accuracy  of  18.9  mm. 

In  the  present  paper  we  develop  and  evaluate  an  integrated 
technique  to  calibrate  multiple  cameras  together  to  form  a  low- 
cost  motion  capture  system.  This  is  a  two-stage  calibration 
technique:  first  the  intrinsic  parameters  are  determined  with 
Zhang’s  method  [5]  and  then  the  extrinsic  parameters  are 
determined  with  Svoboda’s  method  [9].  Individual  stereo  vision 
systems  could  be  further  integrated  using  Horn’s  algorithm  [8] 
to  extend  the  range  of  motion  capture.  We  have  successfully 
implemented  this  integrated  technique  and  demonstrated  it  with 
good  measurement  accuracy  with  multiple  Wiimotes. 

2.  INDIVIDUAL  AND  MULTIPLE  STEREO  VISION 
SYSTEM  CALIBRATION 

2.1  Intrinsic  and  Extrinsic  Parameters  of  a  Stereo 
Vision  System 


Figure 1. Pinhole camera model


In  the  camera  calibration,  the  transformation  between  3D 
world  coordinates  and  2D  image  coordinates  is  determined  by 
solving  the  unknown  parameters  of  the  camera  model.  The 
perspective  projection  (i.e.,  pinhole)  camera  model  is  illustrated 
in  Fig.  1.  The  center  of  projection  is  at  the  origin  O  of  the 
camera  coordinate  system.  The  image  coordinate  system  is 
parallel  to  the  camera  coordinate  system,  with  a  distance  f 
(focal  length)  from  O  along  the  zc  axis,  which  is  the  optical  axis 
or  the  principal  axis.  The  intersection  between  the  image  plane 
and the optical axis is the principal point o. The u and v axes of
the image plane coordinate system are parallel to the x and y
axes of the camera coordinate system, respectively. The coordinates
of the principal point in the image plane coordinate system are (u0, v0).

As shown in Fig. 1, let P be an arbitrary point located on
the positive side of the zc axis and p be its projection on the
image plane. The coordinates of P are (xc, yc, zc) in the camera
coordinate system and (x, y, z) in the world coordinate system.
The coordinates of p in the image plane coordinate system
are (u, v), which are related to (x, y, z) by the following
equation:


$$
\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= A \begin{bmatrix} R & T \end{bmatrix}
\begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}
\tag{1}
$$

where

$$
A = \begin{bmatrix} \alpha_x & s & u_0 \\ 0 & \alpha_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}
$$

R and T are the rotation matrix and translation vector that
relate the world coordinate system to the camera coordinate
system. R and T each consist of three independent parameters,
which are the extrinsic parameters of each camera. A is the
intrinsic parameter matrix, where the parameter s represents the
skewness of the image in terms of the two image axes, $\alpha_x = f/d_x$
and $\alpha_y = f/d_y$ are scaling factors in the image u and v axes,
respectively, f is the camera focal length, $d_x$ and $d_y$ are the pixel
dimensions in the x and y directions, respectively, and $\lambda$ is a
scale factor. Determining the intrinsic and extrinsic parameters
is essential to the calibration of a stereo vision system with
digital cameras.
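
As an illustration of equation (1), the following minimal sketch projects a world point to pixel coordinates. The intrinsic values (1300-pixel scaling factors, zero skew, principal point at the centre of a 1024 x 768 image) and the function names are assumptions made for the example only; they are not calibration results from this work.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def project(A, R, T, point_world):
    """Project a 3D world point to pixel coordinates (u, v) using equation (1)."""
    # World frame -> camera frame: P_c = R * P_w + T
    pc = [r + t for r, t in zip(mat_vec(R, point_world), T)]
    # Camera frame -> homogeneous image coordinates: lambda * [u, v, 1]^T = A * P_c
    uvw = mat_vec(A, pc)
    lam = uvw[2]  # scale factor lambda; equals the depth z_c since the last row of A is [0, 0, 1]
    return uvw[0] / lam, uvw[1] / lam


# Hypothetical intrinsic matrix (illustrative values only).
A = [[1300.0, 0.0, 512.0],
     [0.0, 1300.0, 384.0],
     [0.0, 0.0, 1.0]]
# Camera frame aligned with the world frame (identity rotation, zero translation).
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
T = [0.0, 0.0, 0.0]

u, v = project(A, R, T, [0.1, -0.05, 2.0])
print(u, v)
```

Calibration solves the inverse problem: given many observed (u, v) pairs for points with known (x, y, z), it estimates A, R and T.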


2.2  Establishing  a  Stereo  Vision  System  with  Two 
Cameras  Using  Zhang’s  Algorithm 

To  establish  a  stereo  vision  system  with  two  cameras,  the 
relationship  between  the  two  camera  coordinate  systems  also 
needs  to  be  calibrated.  For  a  point  P  in  3D  space,  its  two 
coordinates  PL  and  PR  in  the  left  and  right  camera  coordinate 
systems  have  the  following  relationship: 


$$
P_R = R_s P_L + T_s \tag{2}
$$

where $R_s$ is the rotation matrix and $T_s$ is the translation vector
between the two coordinate frames of the stereo vision system.

To calibrate a stereo vision system using Zhang’s technique
[5],  a  calibration  plate  can  be  used.  For  example,  Fig.  2  shows 
an  experimental  setup  for  collecting  the  calibration  data  for 
Wiimotes  using  a  36  LED  calibration  plate  shown  in  Fig.  3. 
During  the  calibration  process,  the  calibration  plate  is  moved  to 
different  locations  in  the  field  of  view  of  the  Wiimotes. 
Readings  are  taken  only  when  all  the  36  LEDs  on  the 
calibration  plate  can  be  seen  by  both  Wiimotes  for  a  given 
position of the calibration plate. In Fig. 2, four Wiimotes are
mounted on a ceiling bar of a 10’ x 10’ x 10’ CAVE
(Computer Automated Virtual Environment) and form two stereo
systems of two Wiimotes each.
Figure 4 shows a pictorial representation of the 36
LEDs  at  two  different  locations  as  seen  by  a  Wiimote  in  the 
camera  coordinate  system.  The  pixel  coordinates  of  each  LED 
at  each  fixed  location  can  be  read  from  the  Wiimote.  Together 
with  the  known  world  coordinates  of  each  LED’s  position,  the 
intrinsic  parameters  of  the  Wiimote  can  be  calculated  using  the 
Camera  Calibration  Toolbox  of  MATLAB,  which  was 
developed  by  Bouguet  [10]  based  on  Zhang’s  algorithm. 


Figure  2.  Calibration  with  Zhang’s  method  using  a  calibration 
plate  with  36  IR  LEDs 


Figure  3.  The  calibration  plate  with  36  IR  LEDs 


Figure  4.  Data  collected  during  the  calibration 


After  the  calibration  of  each  camera,  the  calibration  results 
from  the  two  cameras  can  be  used  to  calibrate  the  two-camera 
stereo  vision  system.  There  exists  a  relationship  between  the 
world  coordinate  system  and  the  camera  coordinate  system 
through  the  extrinsic  parameters,  i.e.,  the  rotation  matrix  R  and 
translation  vector  T,  as  follows: 

$$
P_{CL} = R_1 P_1 + T_1, \qquad P_{CR} = R_2 P_2 + T_2 \tag{3}
$$

where $P_{CL}$ and $P_{CR}$ represent the point’s 3D coordinates in the
left camera frame and in the right camera frame, respectively,
$P_1$ and $P_2$ are two points expressed in the world coordinate
system, $R_1$ and $T_1$ are the calibration results of extrinsic
parameters for the left camera, and $R_2$ and $T_2$ are the calibration
results of extrinsic parameters for the right camera. The two
points $P_1$ and $P_2$ in the calibration are the same point, thus
$P_1 = P_2$. From equation (3) we have the following relationship:

$$
P_{CR} = R_2 R_1^{-1} P_{CL} + \left( T_2 - R_2 R_1^{-1} T_1 \right) \tag{4}
$$

The above equation can be written in the form of

$$
P_{CR} = R_s P_{CL} + T_s \tag{5}
$$

where  Rs  is  the  rotation  matrix  and  Ts  is  the  translation  vector 
that  represent  the  transformation  between  the  two  camera 
coordinate  systems. 
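
The composition in equations (4) and (5) can be sketched in a few lines. The extrinsic values below are made up for the check and the helper names are hypothetical; the sketch only relies on the fact that the inverse of a rotation matrix is its transpose.

```python
def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def mat_mul(a, b):
    """Multiply two 3x3 matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def transpose(m):
    return [list(col) for col in zip(*m)]


def stereo_extrinsics(R1, T1, R2, T2):
    """Return (Rs, Ts) such that P_CR = Rs * P_CL + Ts, following equation (4)."""
    Rs = mat_mul(R2, transpose(R1))  # R2 * R1^-1, with R1^-1 = R1^T for a rotation
    Ts = [t2 - x for t2, x in zip(T2, mat_vec(Rs, T1))]  # T2 - R2 * R1^-1 * T1
    return Rs, Ts


# Made-up extrinsics: left camera rotated 90 degrees about z, right camera 90 degrees about x.
R1 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
T1 = [1.0, 2.0, 3.0]
R2 = [[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
T2 = [-1.0, 0.0, 2.0]

Rs, Ts = stereo_extrinsics(R1, T1, R2, T2)

# Consistency check: the same world point seen in both camera frames.
p_world = [0.5, -0.2, 1.5]
p_cl = [a + b for a, b in zip(mat_vec(R1, p_world), T1)]
p_cr = [a + b for a, b in zip(mat_vec(R2, p_world), T2)]
p_cr_from_left = [a + b for a, b in zip(mat_vec(Rs, p_cl), Ts)]
print(p_cr, p_cr_from_left)
```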

2.3  Calibration  of  Multiple  Stereo  Vision  Systems 

Often  two  or  more  stereo  vision  systems  need  to  be 
integrated  into  a  single  system  in  order  to  provide  a  larger 
coverage  volume.  In  this  case,  each  stereo  system  generates  the 
3D  coordinates  of  measured  points  with  respect  to  its  own 
coordinate  system.  To  integrate  these  vision  systems  into  one 
single  system,  it  is  necessary  to  determine  the  relative  position 
and  orientation  among  the  various  sub-stereo-vision  systems. 

Consider a set of 3D points measured by two stereo vision
systems, each having its local coordinate frame. These points
can simply be the positions occupied by a marker that is
moved randomly in 3D space, as shown in Fig. 5, such that the
marker is seen by both stereo systems each time the data is
read.


Figure  5.  Integration  of  two  stereo  vision  systems  using  Horn’s 
algorithm 


Suppose there are n different positions of the marker in 3D
space. For the ith position, let the data read by the left camera be
represented as $r_{l,i}$ in the left camera coordinate system, and the
data read by the right camera be represented as $r_{r,i}$ in the right
camera coordinate system.

The centroids $\bar{r}_l$ and $\bar{r}_r$ of the two sets of data can be
calculated in the left and the right coordinate systems,
respectively, as follows:

$$
\bar{r}_l = \frac{1}{n} \sum_{i=1}^{n} r_{l,i}, \qquad
\bar{r}_r = \frac{1}{n} \sum_{i=1}^{n} r_{r,i} \tag{6}
$$

The transformation from the left to the right coordinate
system has the form

$$
r_r = s R(r_l) + T_0 \tag{7}
$$

where s is a scale factor, $T_0$ is the translation, and $R(r_l)$ denotes
the rotation of vector $r_l$.

Considering  the  inaccuracy  of  the  camera  calibration 
results  and  the  inconsistencies  in  the  camera  hardware,  it  is 
practically  impossible  to  find  a  unique  transformation  that  maps 
the  entire  set  of  measured  coordinates  of  a  set  of  points  in  one 
coordinate  system  exactly  into  the  measured  coordinates  of 
these  points  in  the  other  coordinate  system.  The  residual  error 
can  be  written  as 

$$
e_i = r_{r,i} - s R(r_{l,i}) - T_0 \tag{8}
$$


The root mean square (RMS) of errors for n data points can
be calculated as

$$
e_{RMS} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \lVert e_i \rVert^2}
$$


The  classical  approach  of  finding  the  transformation 
parameters  s,  R  and  T0  is  to  minimize  the  sum  of  squares  of 
errors  numerically.  However,  Horn  [8]  derived  a  closed-form 
solution  to  the  least-square  problem  by  use  of  unit  quaternions 
to  represent  rotation.  This  solution  has  been  coded  into  a 
software  routine  by  Wengert  and  Bianchi  [11]  and  we  used  it  to 
determine  the  transformation  parameters  in  calibrating  two 
stereo  vision  systems  with  respect  to  each  other. 
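
For illustration, the sketch below follows the structure of Horn's closed-form solution [8]: centroids from equation (6), a symmetric 4x4 matrix built from the centered data, the unit quaternion taken as the eigenvector of its largest eigenvalue, and the residuals of equation (8). It is not the routine of Wengert and Bianchi [11]. The small Jacobi eigen-solver is a stand-in for a library eigen-decomposition, the scale is computed with the symmetric ratio-of-norms form, and all function names and test data are assumptions made for this example.

```python
import math


def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def jacobi_eigh(sym, sweeps=30):
    """Eigen-decomposition of a small symmetric matrix by cyclic Jacobi rotations.
    Returns (eigenvalues, V); column k of V is the eigenvector of eigenvalue k."""
    n = len(sym)
    a = [list(map(float, row)) for row in sym]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.hypot(theta, 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for m in (a, v):  # right-multiply by the rotation: update columns p, q
                    for k in range(n):
                        mp, mq = m[k][p], m[k][q]
                        m[k][p], m[k][q] = c * mp - s * mq, s * mp + c * mq
                for k in range(n):  # left-multiply a by the transposed rotation: rows p, q
                    ap, aq = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * ap - s * aq, s * ap + c * aq
    return [a[i][i] for i in range(n)], v


def horn_absolute_orientation(left_pts, right_pts):
    """Estimate s, R, T0 such that r_r ~= s * R(r_l) + T0, as in equation (7)."""
    n = len(left_pts)
    cl = [sum(p[k] for p in left_pts) / n for k in range(3)]   # centroids, equation (6)
    cr = [sum(p[k] for p in right_pts) / n for k in range(3)]
    dl = [[p[k] - cl[k] for k in range(3)] for p in left_pts]
    dr = [[p[k] - cr[k] for k in range(3)] for p in right_pts]
    # S[a][b] = sum over points of (left coordinate a) * (right coordinate b)
    S = [[sum(l[a] * r[b] for l, r in zip(dl, dr)) for b in range(3)] for a in range(3)]
    (Sxx, Sxy, Sxz), (Syx, Syy, Syz), (Szx, Szy, Szz) = S
    N = [[Sxx + Syy + Szz, Syz - Szy, Szx - Sxz, Sxy - Syx],
         [Syz - Szy, Sxx - Syy - Szz, Sxy + Syx, Szx + Sxz],
         [Szx - Sxz, Sxy + Syx, -Sxx + Syy - Szz, Syz + Szy],
         [Sxy - Syx, Szx + Sxz, Syz + Szy, -Sxx - Syy + Szz]]
    vals, vecs = jacobi_eigh(N)
    best = max(range(4), key=lambda i: vals[i])
    q0, qx, qy, qz = [vecs[i][best] for i in range(4)]  # unit quaternion of the rotation
    R = [[q0*q0 + qx*qx - qy*qy - qz*qz, 2*(qx*qy - q0*qz), 2*(qx*qz + q0*qy)],
         [2*(qy*qx + q0*qz), q0*q0 - qx*qx + qy*qy - qz*qz, 2*(qy*qz - q0*qx)],
         [2*(qz*qx - q0*qy), 2*(qz*qy + q0*qx), q0*q0 - qx*qx - qy*qy + qz*qz]]
    s = math.sqrt(sum(x * x for d in dr for x in d) / sum(x * x for d in dl for x in d))
    rot_cl = mat_vec(R, cl)
    T0 = [cr[k] - s * rot_cl[k] for k in range(3)]
    return s, R, T0


def rms_residual(left_pts, right_pts, s, R, T0):
    """RMS of the residuals e_i = r_r,i - s * R(r_l,i) - T0 from equation (8)."""
    total = 0.0
    for l, r in zip(left_pts, right_pts):
        rl = mat_vec(R, l)
        total += sum((r[k] - s * rl[k] - T0[k]) ** 2 for k in range(3))
    return math.sqrt(total / len(left_pts))


# Synthetic, noise-free test data (mm): rotate 30 degrees about z and translate.
ang = math.radians(30.0)
R_true = [[math.cos(ang), -math.sin(ang), 0.0],
          [math.sin(ang), math.cos(ang), 0.0],
          [0.0, 0.0, 1.0]]
T_true = [100.0, -50.0, 20.0]
left = [[0.0, 0.0, 0.0], [500.0, 0.0, 0.0], [0.0, 400.0, 0.0],
        [0.0, 0.0, 300.0], [200.0, 300.0, 100.0]]
right = [[x + t for x, t in zip(mat_vec(R_true, p), T_true)] for p in left]

s_est, R_est, T_est = horn_absolute_orientation(left, right)
rms = rms_residual(left, right, s_est, R_est, T_est)
print(s_est, T_est, rms)
```

With measured rather than synthetic data the residual does not vanish, and the RMS value gives a single figure for how well the two stereo systems agree after alignment.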

2.4  Evaluation  Results 

Figure 6 shows the measurement accuracy of a single-stereo
system consisting of two Wiimotes using Zhang’s
algorithm.  The  average  magnitude  of  measurement  error  is  2.7 
mm.  Figure  7  shows  the  measurement  accuracy  of  a  stereo 
system  formed  by  integrating  two  single-stereo  systems 
together  using  Horn’s  algorithm.  The  integrated  system 
consists  of  a  total  of  four  Wiimotes  and  covers  a  larger 
measurement  volume  compared  to  a  single-stereo  system  with 
two  Wiimotes.  However,  the  average  magnitude  of 
measurement  error  has  increased  to  18.9  mm. 


Figure 6. Measurement accuracy for a single stereo system of 2
Wiimotes (plot title: Performance Evaluation of a Single Stereo System)

Figure 7. Measurement data with two stereo systems of 2
Wiimotes each for a total of 4 Wiimotes

3.  SVOBODA’S  ALGORITHM  FOR  CALIBRATION  OF  A 
SYSTEM  OF  MULTIPLE  (>2)  CAMERAS 

3.1  The  Multi-Camera  Self-Calibration  Technique 

Svoboda  et  al.  [9]  proposed  a  method  for  calibrating 
multiple  cameras  together.  The  minimum  number  of  cameras 
that  can  be  calibrated  together  using  this  method  is  3,  and  there 
is  no  upper  limit.  The  only  calibration  object  required  is  an  IR 
point  source.  Several  IR  LEDs  can  be  mounted  closely  together 
on  a  wand  to  form  a  uniform  IR  light  source  which  can  be  seen 
by  almost  all  of  the  cameras  in  all  directions.  The  calibration 
can  be  achieved  by  moving  an  IR  LED  through  the  work 
volume.  The  cameras  being  calibrated  do  not  have  to  see  all  of 
the  points  where  the  data  is  recorded,  i.e.,  only  sufficient 
overlap  among  the  cameras  being  calibrated  is  necessary.  This 
property  helps  in  obtaining  more  coverage  volume  per  Wiimote 
in  the  stereo  vision  system. 

Briefly, Svoboda’s algorithm is as follows. Let m be the
number of cameras to be calibrated together, and n be the
number of calibration points recorded. Also, let $X_j$ be a
calibration point with homogeneous coordinates $[x_j, y_j, z_j, 1]^T$ in
the world coordinates, where j = 1, ..., n. The pinhole camera
model in equation (1) can be written for m cameras and n
calibration points as:

$$
\begin{bmatrix}
\lambda_1^1 \mathbf{u}_1^1 & \cdots & \lambda_n^1 \mathbf{u}_n^1 \\
\vdots & \ddots & \vdots \\
\lambda_1^m \mathbf{u}_1^m & \cdots & \lambda_n^m \mathbf{u}_n^m
\end{bmatrix}
=
\begin{bmatrix} P^1 \\ \vdots \\ P^m \end{bmatrix}
\begin{bmatrix} X_1 & \cdots & X_n \end{bmatrix},
\qquad
\mathbf{u}_j^i = \begin{bmatrix} u_j^i \\ v_j^i \\ 1 \end{bmatrix}
\tag{9}
$$

where $P^i$ is a 3x4 matrix for camera i (i = 1, ..., m) and contains all 11 camera
parameters (5 intrinsic and 6 extrinsic). Thus the calibration
here involves finding the camera projection matrices $P^i$ and the
scaling factors $\lambda_j^i$. In the calibration process, the 2D projection
coordinates (u, v) of the image of a 3D marker can be first
obtained. Then the wrongly recorded data points (outliers) can
be detected using the RANSAC analysis [12]. Then the scaling
factors $\lambda_j^i$ can be determined and the missing points can be
estimated by the method described by Martinec and Pajdla [13].
The projective structures can be further optimized using
bundle adjustment [14], and the overall matrix in equation (9)
can be further factorized to get matrices P and X [15].

3.2 Evaluation Results

Figure 8 shows our experimental setup with 4
Wiimotes mounted on a CAVE frame for evaluation of
Svoboda’s calibration technique. Two IR LEDs mounted a known
distance apart were waved through the work volume randomly
and their image data was collected. The intrinsic and extrinsic
parameters were then determined and used to compute the
position of each LED at each calibration point. Points were
recorded when the LED was detected by at least three of the
four Wiimotes. Then the scaling factor was determined with a
sufficient number of observations.


Figure 8. Svoboda’s algorithm for self-calibration of 4
Wiimotes

The measurement accuracy is shown in Fig. 9, which
shows the difference between the actual distance of the two
LEDs and the distance obtained by the motion capture system
employing Svoboda’s algorithm. As indicated in Fig. 9, the
average magnitude of measurement error was 2.37 mm, which
is significantly smaller than the 18.9 mm that resulted from using
the combination of Zhang’s and Horn’s algorithms shown in
Fig. 7.


Figure 9. Measurement accuracy for a stereo vision system with
4 Wiimotes using Svoboda’s calibration algorithm

4. AN INTEGRATED CAMERA CALIBRATION
TECHNIQUE 

4.1  Decomposition  of  Camera  Matrix  into  Matrices  of 
Intrinsic  and  Extrinsic  Parameters 

As  discussed  above,  Svoboda’s  algorithm  can  be  used  to 
obtain  intrinsic  and  extrinsic  parameters  simultaneously  for  the 
cameras  in  a  stereo  vision  system.  This  algorithm  has  been 
shown  in  the  above  experimental  evaluation  to  be  more 
accurate  than  the  combination  of  Zhang’s  and  Horn’s 
algorithms  for  a  stereo  vision  system  with  four  Wiimotes. 
However, our experimental evaluations have also indicated that
the values of intrinsic parameters obtained from Svoboda’s
algorithm deviate more from the known values of the
Wiimotes’ intrinsic parameters than those obtained from Zhang’s
algorithm, especially when the overlapping coverage volume
among the 4 Wiimotes becomes small. Thus, the basic idea
behind our integrated camera calibration technique is to
decompose the camera projection matrix $P^i$ in
equation (9) into a matrix of intrinsic parameters and a matrix
of extrinsic parameters, and then to use Zhang’s algorithm to
determine the intrinsic parameters and Svoboda’s
algorithm to determine the extrinsic parameters.

It can be shown that the matrix $P^i$ in equation (9) can be
decomposed into the separate matrices of intrinsic and extrinsic
parameters as follows [15]:

$$
P^i = \begin{bmatrix} A & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} R & T \\ 0 & 1 \end{bmatrix}
\tag{10}
$$


where  the  two  matrices  on  the  right  hand  side  of  equation  (10) 
represent  the  intrinsic  parameters  and  the  extrinsic  parameters, 
respectively.  Thus,  the  intrinsic  camera  parameters  can  be 
determined  using  Zhang’s  algorithm,  and  then  by  using  these 
values  of  intrinsic  parameters  in  equation  (9)  the  extrinsic 
parameters  of  a  system  of  cameras  can  be  determined  together 
using Svoboda’s algorithm. To integrate multiple stereo
systems, the relative positions and orientations of the various
stereo systems can be determined as discussed in Section 2.3.

Computer software has been written in C# using Wiimote
Library functions to identify dynamically the correspondences
between the images for any given marker point. The stereo
vision system identifies a set of three or more Wiimotes that
detect the marker at any given instant. The obtained image
data is then used to solve for the extrinsic parameters and then to
determine the position of each marker point using equation (9).
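
One common way to recover a marker position from three or more calibrated views is linear least-squares triangulation, sketched below in Python rather than the C# of the actual software. It illustrates the idea only and is not the authors' implementation: each view with projection matrix P = A [R T] contributes two linear equations in (x, y, z), and the stacked system is solved through its normal equations. The camera parameters and function names are assumptions for the example.

```python
def projection_matrix(A, R, T):
    """Return the 3x4 projection matrix P = A [R T]."""
    RT = [R[r] + [T[r]] for r in range(3)]
    return [[sum(A[r][k] * RT[k][c] for k in range(3)) for c in range(4)] for r in range(3)]


def project(P, X):
    """Project a 3D point X with a 3x4 projection matrix; returns pixel (u, v)."""
    Xh = list(X) + [1.0]
    h = [sum(P[r][c] * Xh[c] for c in range(4)) for r in range(3)]
    return h[0] / h[2], h[1] / h[2]


def solve3(M, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))) / a[r][r]
    return x


def triangulate(projections, pixels):
    """Least-squares 3D position from pixel observations in several calibrated views."""
    AtA = [[0.0] * 3 for _ in range(3)]
    Atb = [0.0] * 3
    for P, (u, v) in zip(projections, pixels):
        for coord, row in ((u, P[0]), (v, P[1])):
            # (coord * P[2] - row) . [x, y, z, 1] = 0, rearranged to a . X = b
            a = [coord * P[2][k] - row[k] for k in range(3)]
            b = row[3] - coord * P[2][3]
            for i in range(3):
                Atb[i] += a[i] * b
                for j in range(3):
                    AtA[i][j] += a[i] * a[j]
    return solve3(AtA, Atb)


# Hypothetical setup: three cameras sharing one intrinsic matrix, axes aligned with the world.
A = [[1300.0, 0.0, 512.0], [0.0, 1300.0, 384.0], [0.0, 0.0, 1.0]]
I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cams = [projection_matrix(A, I3, T)
        for T in ([0.0, 0.0, 0.0], [-500.0, 0.0, 0.0], [0.0, -400.0, 0.0])]

X_true = [100.0, 50.0, 2000.0]  # marker position in mm
observations = [project(P, X_true) for P in cams]
X_est = triangulate(cams, observations)
print(X_est)
```

Using every Wiimote that currently sees the marker simply adds two more equations per view, so the same routine covers the three-or-more-camera case described above.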

4.2  Experiment  with  a  Stereo  System  with  4  Wiimotes 

In  this  experiment,  4  Wiimotes  are  calibrated  together 
using  the  integrated  calibration  technique  described  above.  A 
wand  with  two  IR  LEDs  mounted  at  a  known  distance  apart  is 
moved  randomly  in  the  space.  By  using  the  integrated 
calibration,  the  positions  of  the  two  markers  and  the  distance 
between  them  are  tracked  in  real  time.  The  measurement 
accuracy  is  shown  in  Fig.  10,  wherein  the  image  data  used  is 
the  same  as  that  in  Fig.  9. 
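
The accuracy measure used in this evaluation, the average magnitude of the difference between the tracked and the known LED separation, can be sketched as follows. The three frames of marker positions are made-up values for illustration; only the 195 mm wand length is taken from Fig. 10, and the function name is hypothetical.

```python
import math


def mean_abs_distance_error(track_a, track_b, true_distance):
    """Average |measured distance - true distance| over all recorded frames."""
    errors = [abs(math.dist(a, b) - true_distance) for a, b in zip(track_a, track_b)]
    return sum(errors) / len(errors)


# Made-up tracked positions (mm) of the two LEDs over three frames.
led_a = [(0.0, 0.0, 0.0), (10.0, 10.0, 10.0), (0.0, 0.0, 0.0)]
led_b = [(195.0, 0.0, 0.0), (10.0, 10.0, 207.0), (0.0, 194.0, 0.0)]

err = mean_abs_distance_error(led_a, led_b, true_distance=195.0)
print(err)
```

Because only the separation of the two LEDs is compared, this measure does not require ground-truth positions for the wand in the work volume.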

Figure 10. Measurement accuracy for a stereo system with 4
Wiimotes using the integrated calibration technique (plot title:
Performance Evaluation of a Single Stereo System with 4 Wiimotes;
distance between LEDs: 195 mm; relative measurement accuracy: 1.07 mm)

By determining the intrinsic parameters first for each of the
4 Wiimotes using Zhang’s method, the measurement accuracy
has improved from 2.37 mm average error (Fig. 9) to 1.07 mm
average error. The measurement error is much smaller than the
average error of 18.9 mm (Fig. 7) that resulted from using the
combination of Zhang’s and Horn’s algorithms.

4.3 Experiment with a Stereo System with 8 Wiimotes

In  order  to  cover  a  larger  measurement  volume,  the 
integrated  calibration  technique  is  also  evaluated  for  8 
Wiimotes  which  are  arranged  such  that  each  set  of  4  Wiimotes 
forms  a  stereo  system  and  the  two  systems  are  integrated  into  a 
single  stereo  system  using  Horn’s  algorithm  to  determine  the 
relative  position  and  orientation  of  the  two  systems.  Figure  11 
shows  the  8  Wiimotes  mounted  in  the  CAVE,  with  two  stereo 
subsystems  each  consisting  of  four  Wiimotes.  Figure  12  shows 
the  measurement  accuracy  for  the  integrated  stereo  vision 
system  with  8  Wiimotes.  The  average  magnitude  of 
measurement  error  is  8.7  mm. 


Figure 11. A stereo vision system with 8 Wiimotes

Figure 12. Measurement errors for the stereo system with 8
Wiimotes using the integrated calibration technique


5. CONCLUSIONS

This  paper  discusses  techniques  for  calibrating  multiple 
digital  cameras  to  form  a  stereo  vision  system.  The  calibration 
algorithms  discussed  include  Zhang’s,  Svoboda’s,  and  Horn’s 
algorithms.  A  new  calibration  technique  has  been  developed  by 
integrating  Zhang’s  and  Svoboda’s  algorithms.  The  basic  idea 
of  this  integrated  calibration  technique  is  first  using  Zhang’s 
algorithm  to  determine  the  intrinsic  parameters  and  then  using 
Svoboda’s  algorithm  to  determine  the  extrinsic  parameters.  This 
integrated  technique  can  be  further  used  with  Horn’s  algorithm 
for  integration  of  multiple  stereo  vision  systems  into  a  single 
stereo  system  to  increase  measurement  volume.  The  various 
techniques  are  evaluated  and  compared  on  measurement 
accuracy  for  a  stereo  vision  system  setup  with  4  Wiimotes.  It  is 
shown  that  the  measurement  accuracy  of  the  integrated 
technique  has  improved  over  Svoboda’s  technique  and  is  much 
better  than  the  measurement  accuracy  obtained  by  using  the 
combination  of  Zhang’s  and  Horn’s  algorithms.  The  Wiimote 
based  systems  implemented  are  much  cheaper  than  most  of  the 
commercially  available  motion  capture  systems,  providing 
great  economic  benefits  for  many  practical  applications.  Such  a 
distributed  stereo  vision  system  with  inexpensive  cameras 
enables  the  creation  of  a  low-cost,  wireless,  versatile  motion 
capture system.